46 research outputs found

    Developing techniques for enhancing comprehensibility of controlled medical terminologies

    A controlled medical terminology (CMT) is a collection of concepts (or terms) that are used in the medical domain. Typically, a CMT also contains attributes of those concepts and/or relationships between those concepts. Electronic CMTs are extremely useful and important for communication between and integration of independent information systems in healthcare, because data in this area is highly fragmented. A single query in this area might involve several databases, e.g., a clinical database, a pharmacy database, a radiology database, and a lab test database. Unfortunately, the extensive sizes of CMTs, often containing tens of thousands of concepts and hundreds of thousands of relationships between pairs of those concepts, impose steep learning curves for new users of such CMTs. In this dissertation, we address the problem of helping a user to orient himself in an existing large CMT. In order to help a user comprehend a large, complex CMT, we need to provide abstract views of the CMT. However, at this time, no tools exist for providing a user with such abstract views. One reason for the lack of tools is the absence of a good theory on how to partition an overwhelming CMT into manageable pieces. In this dissertation, we try to overcome the described problem by using a three-pronged approach. (1) We use the power of Object-Oriented Databases (OODBs) to design a schema extraction process for large, complex CMTs. The schema resulting from this process provides an excellent, compact representation of the CMT. (2) We develop a theory and a methodology for partitioning a large OODB schema, modeled as a graph, into small meaningful units. The methodology relies on the interaction between a human and a computer, making optimal use of the human's semantic knowledge and the computer's speed. Furthermore, the theory and methodology developed for the schema-level partitioning are also adapted to the object level of a CMT. (3) We use purely structural similarities for partitioning CMTs, eliminating the need for a human expert in the partitioning methodology mentioned above. Two large medical terminologies are used as our test beds, the Medical Entities Dictionary (MED) and the Unified Medical Language System (UMLS), which itself contains a number of terminologies.

    Relationship auditing of the FMA ontology

    The Foundational Model of Anatomy (FMA) ontology is a domain reference ontology based on a disciplined modeling approach. Due to its large size, semantic complexity and manual data entry process, errors and inconsistencies are unavoidable and might remain within the FMA structure without detection. In this paper, we present computable methods to highlight candidate concepts for various relationship assignment errors. The process starts by locating structures formed by transitive structural relationships (part_of, tributary_of, branch_of) and then examines their assignments in the context of the IS-A hierarchy. The algorithms were designed to detect five major categories of possible incorrect relationship assignments: circular, mutually exclusive, redundant, inconsistent, and missed entries. A domain expert reviewed samples of these presumptive errors to confirm the findings. Seven thousand and fifty-two presumptive errors were detected, the largest proportion related to part_of relationship assignments. The results highlight the fact that errors are unavoidable in complex ontologies and that well-designed algorithms can help domain experts to focus on concepts with a high likelihood of errors and maximize their effort to ensure consistency and reliability. In the future, similar methods might be integrated with data entry processes to offer real-time error detection.
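
    Two of the error categories named in this abstract, circular and redundant assignments, can be illustrated with a minimal sketch. It assumes a transitive relationship such as part_of is stored as (child, parent) pairs. The function names and the toy anatomy pairs are hypothetical and are not taken from the FMA or from the paper's algorithms.

```python
from collections import defaultdict


def build_graph(edges):
    """Build adjacency sets from (child, parent) pairs of one relation."""
    graph = defaultdict(set)
    for child, parent in edges:
        graph[child].add(parent)
    return graph


def reachable(graph, start, skip_edge=None):
    """Return nodes reachable from start via one or more edges.

    skip_edge, if given, is a single (child, parent) edge to ignore.
    """
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if (node, nxt) == skip_edge:
                continue
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def circular_concepts(graph):
    """Concepts that can reach themselves, i.e. lie on a cycle."""
    return {node for node in list(graph) if node in reachable(graph, node)}


def redundant_edges(graph):
    """Direct edges whose target is also reachable by another path."""
    return {
        (child, parent)
        for child in list(graph)
        for parent in graph[child]
        if parent in reachable(graph, child, skip_edge=(child, parent))
    }


# Hypothetical toy data, for illustration only.
toy_part_of = [
    ("finger", "hand"),
    ("hand", "arm"),
    ("finger", "arm"),  # implied by the two edges above
    ("a", "b"),
    ("b", "a"),         # forms a cycle with the edge above
]
toy_graph = build_graph(toy_part_of)
print(circular_concepts(toy_graph))
print(redundant_edges(toy_graph))
```

    Flagged items would still be only candidates. As the abstract notes, a domain expert reviews samples of the presumptive errors.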

    A chemical specialty semantic network for the Unified Medical Language System

    Background: Terms representing chemical concepts found in the Unified Medical Language System (UMLS) are used to derive an expanded semantic network with mutually exclusive semantic types. The UMLS Semantic Network (SN) is composed of a collection of broad categories called semantic types (STs) that are assigned to concepts. Within the UMLS's coverage of the chemical domain, we find a great many concepts being assigned more than one ST. This leads to the situation where the extent of a given ST may contain concepts elaborating variegated semantics. A methodology for expanding the chemical subhierarchy of the SN into a finer-grained categorization of mutually exclusive types with semantically uniform extents is presented. We call this network a Chemical Specialty Semantic Network (CSSN). A CSSN is derived automatically from the existing chemical STs and their assignments. The methodology incorporates a threshold value governing the minimum size of a type's extent needed for inclusion in the CSSN. Thus, different CSSNs can be created by choosing different threshold values based on varying requirements. Results: A complete CSSN is derived using a threshold value of 300 and having 68 STs. It is used effectively to provide high-level categorizations for a random sample of compounds from the "Chemical Entities of Biological Interest" (ChEBI) ontology. The effect on the size of the CSSN using various threshold parameter values between one and 500 is shown. Conclusions: The methodology has several potential applications, including its use to derive a pre-coordinated guide for ST assignments to new UMLS chemical concepts, as a tool for auditing existing concepts, inter-terminology mapping, and to serve as an upper-level network for ChEBI.
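
    The role of the threshold can be illustrated with a rough sketch. It assumes, as one plausible reading of the abstract rather than the authors' actual derivation, that each distinct combination of assigned STs becomes a candidate type, which makes the extents mutually exclusive, and that a candidate is kept only if its extent reaches the threshold. All names and the toy assignments are hypothetical.

```python
from collections import defaultdict


def group_by_type_set(assignments):
    """Map each distinct set of STs to the concepts assigned exactly that set.

    assignments: dict of concept -> iterable of ST labels.
    Every concept lands in exactly one group, so the extents are disjoint.
    """
    extents = defaultdict(set)
    for concept, types in assignments.items():
        extents[frozenset(types)].add(concept)
    return dict(extents)


def apply_threshold(extents, threshold):
    """Keep only candidate types whose extent has at least threshold concepts."""
    return {
        types: concepts
        for types, concepts in extents.items()
        if len(concepts) >= threshold
    }


# Hypothetical toy assignments; labels T1..T3 stand in for chemical STs.
toy_assignments = {
    "c1": ["T1"],
    "c2": ["T1"],
    "c3": ["T1", "T2"],
    "c4": ["T1", "T2"],
    "c5": ["T2", "T3"],
}
candidates = group_by_type_set(toy_assignments)
for threshold in (1, 2, 3):
    kept = apply_threshold(candidates, threshold)
    print(threshold, len(kept))
```

    Raising the threshold shrinks the resulting network, which mirrors the abstract's remark that different threshold values yield different CSSNs. How concepts from discarded candidates are handled is not stated in the abstract and is left out of the sketch.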

    Single Endemic Genotype of Measles Virus Continuously Circulating in China for at Least 16 Years

    The incidence of measles in China from 1991 to 2008 was reviewed, and the nucleotide sequences from 1507 measles viruses (MeV) isolated during 1993 to 2008 were phylogenetically analyzed. The results showed that measles epidemics peaked approximately every 3 to 5 years with the range of measles cases detected between 56,850 and 140,048 per year. The Chinese MeV strains represented three genotypes: 1501 H1, 1 H2 and 5 A. Genotype H1 was the predominant genotype throughout China, continuously circulating for at least 16 years. Genotype H1 sequences could be divided into two distinct clusters, H1a and H1b. A 4.2% average nucleotide divergence was found between the H1a and H1b clusters. The nucleotide sequence and predicted amino acid homologies were 92.3%–100% and 84.7%–100%, respectively, for H1a viruses, and 97.1%–100% and 95.3%–100%, respectively, for H1b viruses. Viruses from both clusters were distributed throughout China with no apparent geographic restriction, and multiple co-circulating lineages were present in many provinces. Cluster H1a and H1b viruses were co-circulating during 1993 to 2005, while no H1b viruses were detected after 2005 and the transmission of that cluster has presumably been interrupted. Analysis of the nucleotide and predicted amino acid changes in the N proteins of H1a and H1b viruses showed no evidence of selective pressure. This study investigated the genotype and cluster distribution of MeV in China over a 16-year period to establish a genetic baseline before MeV elimination in the Western Pacific Region (WPR). Continuous and extensive MeV surveillance and the ability to quickly identify imported cases of measles will become more critical as measles elimination goals are achieved in China in the near future. This is the first report that a single endemic genotype of measles virus has been found to be continuously circulating in one country for at least 16 years.

    Robust estimation of bacterial cell count from optical density

    Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements. In our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data.
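
    The idea of calibrating OD against a serial dilution of particles with known counts can be sketched as follows. This is a simplified illustration and not the study's protocol: it assumes blank-subtracted OD readings within the instrument's linear range, fits a single proportionality constant through the origin, and reports fold residuals. The function names and all numbers are hypothetical.

```python
def dilution_series(start_count, steps, factor=2.0):
    """Particle counts per well for a serial dilution starting at start_count."""
    return [start_count / factor ** i for i in range(steps)]


def particles_per_od(counts, ods):
    """Least-squares slope through the origin for counts ~ k * od."""
    numerator = sum(c * o for c, o in zip(counts, ods))
    denominator = sum(o * o for o in ods)
    return numerator / denominator


def fold_residuals(counts, ods, k):
    """Fold difference (always >= 1) between predicted and known counts."""
    return [max(k * o / c, c / (k * o)) for c, o in zip(counts, ods)]


# Hypothetical example values: known particle counts and blank-subtracted ODs.
known_counts = dilution_series(8e8, 4)
measured_ods = [0.8, 0.4, 0.2, 0.1]

k = particles_per_od(known_counts, measured_ods)
residuals = fold_residuals(known_counts, measured_ods, k)
within = sum(r < 1.2 for r in residuals) / len(residuals)
print(k, within)

# An OD reading from a culture could then be converted to an estimated count.
estimated_count = k * 0.25
```

    The fraction of fold residuals below 1.2 corresponds to the kind of precision figure quoted in the abstract. Wells that deviate strongly at either end of the series would suggest the limits of the instrument's effective linear range.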

    Converting an Integrated Hospital Formulary into an Object-Oriented Database Representation

    the tasks of information sharing and integration, communication among various software applications, and decision support. Modeling a CMV as an Object-Oriented Database (OODB) provides additional benefits such as increased support for vocabulary comprehension and flexible access. In this paper, we describe the process of modeling and converting an existing integrated hospital formulary (i.e., set of pharmacological concepts) into an equivalent OODB representation, which, in general, we refer to as an Object-Oriented Healthcare Vocabulary Repository (OOHVR). The source for our example OOHVR is a formulary provided by the Connecticut Healthcare Research and Education Foundation (CHREF). Utilizing this source formulary together with the semantic hierarchy composed of major and minor drug classes defined as part of the National Drug Code (NDC) directory, we constructed a CMV that was eventually converted into its OOHVR form (the CHREF-OOHVR). The actual conversion step was carried out automatically by a program, called the OOHV